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ABSTRACT 

This  paper  is  a  tutorial  discussion  of  identification 
problems.  The  basic  principles  of  one  method  of  identification 
are  discussed.  The  method  of  quasilinearization  has  been 
developed  by  Richard  Bellman  and  Robert  Kalaba  and  others. 

This  method  is  discussed  in  some  detail  with  emphasis  on 
programming  strategies.  These  strategies  are  intended  to 
improve  the  utility  of  a  general  quasilinearization  program. 
Specific  strategies  are  concerned  with  expansion  of  convergence 
space,  superposition  of  particular  solutions,  and  convergence 


criteria . 
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Introduction 

System  identification  is  that  branch  of  the  physical 
sciences  in  which  we  answer  either  of  the  two  closely  related 
questions : 

(1)  Given  enough  data  to  completely  define  an  existing 
system,  what  is  the  system  (generally,  what  are  the 
parameters  of  the  differential  equation (s)  that 
will  yield  the  observed  solution)? 

Or 

(2)  Given  a  desired  response  of  an  embryonic  system, 
what  are  the  proper  system  parameters  that  will 
satisfy  the  desired  response  in  a  best  fit  sense 
and  simultaneously  minimize  a  cost  function  that 
may  take  on  a  very  general  form? 

We  will  restrict  ourselves  to  problems  governed  by  dif¬ 
ferential  equations.  Those  problems  governed  by  algebraic 
equations  are  satisfactorily  covered  in  many  elementary  texts. 
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The  Simplest  Identification  Problem 


From  the  introduction,  it  is  obvious  that  the  simplest 
nontrivial  identification  problem  will  be  governed  by  a  linear 
first  order  homogeneous  differential  equation. 

We  will  then  assume 
y  =  ay 
subject  to 
y(tx)  =  yx 

y(t2)  =  y2 


(1) 


t>0 


(2) 


t2>tli0 


The  solution  of  the  differential  equation  (2)  is 

v  =  v  Pat 

y  yQe 


(3) 


We  now  have  two  unknowns,  namely  yQ  and  a.  To  determine  these 


we  write  the  following  set  of  nonlinear  algebraic  equations 
rl 


at, 

yi  =  yQe  i 


(4) 


at- 

y2  =  yoe  2 


Since  this  problem  is  exceedingly  simple,  we  can  solve  for  the 
unknowns 

ln(y1/y2) 


a  =  - 


ln(y,/y2) 

yo  =  yl  e  {  -lVt2/t"  } 


(5) 


eii-sU. 


2 


The  conclusion  can  now  be  reached  that,  even  in  the  simplest 
form,  identification  problems  are  basically  nonlinear.  The  non¬ 
linearity  is  there  in  spite  of  the  apparent  linearity  of  the 
original  differential  equations. 

We  now  take  a  different  view,  that  is,  the  original  dif¬ 
ferential  equation  is  nonlinear.  If  we  rationalize  that  "a"  is 
a  function  of  "t"  and  is  an  unknown  function  of  "t"  which  is 
governed  by  a  simultaneous  differential  euqation,  then  we  get 
y  =  ay 

(6) 

a  =  o 

Actually,  we  have  now  cast  our  system  into  its  true  perspective. 

We  have  two  data  points  and  two  simultaneous  first  order  non¬ 
linear  differential  equations.  That  is  what  we  will  usually 
expect. 

The  differential  equation  for  "a"  obviously  states  that 
it  doesn't  change  with  respect  to  "t". 

II 

Another  simple  Identification  Problem 

Let's  take  a  different  look  at  the  simple  harmonic  motion 
problem.  The  governing  differential  equation  is 

x  +  £x  =  0  t^O  (7) 

The  necessary  boundary  conditions  will  be  taken  as 


x(tx) 

x(t2> 

=  X1 

X2 

t3>t2>tl^° 

(8) 

x(t3) 

=  X3 

3 


The  ordering  for  will  be  assumed  hereafter  and  is  arbitrary. 
Denoting  </X~  =  oj  we  are  quickly  led  to 

x 

x.  =  x  cos  (cat.)  +  —  sin(wt.) 

1  o  1  0)  1 

X 

x_  =  x  cos(u)t„)  +  —  sin(wt«)  (9) 

Lot  u  L 

x,  =  x  cos  ( ait +  —  sin  (cat.,) 

3  o  3  co  3 

These  three  nonlinear  algebraic  equations  can  be  solved  by  the 
use  of  the  Newton  Raphson  algorithm  or  some  other  suitable 
algorithm.  The  three  unknowns  are  obviously  xq,  xQ,  and  £. 

(Or  in  the  form  shown  (/IT-  =  ca)). 

Another  set  of  boundary  conditions  is  of  course  possible. 

It  is  possible  for  boundary  conditions  to  be  on  the  derivative  (s) 
of  the  functions.  Taking  the  boundary  conditions  to  be 
x(t^)  =  x1 

x ( 1 2 )  =  x  2  (10) 

x(t3)  =  x  3 

Then  the  necessary  algebraic  equations  are 

•  • 

x^  -  _xow  sin(wt^)  +  xQ  cos(cat^) 

x-  =  -x  w  sin(oat0)  +  i  cos  (oat.,)  (11) 

to  to  L 

X 

x-  =  x^  cos(catQ)  +  —  sin(u)t-) 
jo  j  cj  j 

The  degree  of  difficulty  of  both  these  problems  is  essentially 
the  same.  The  solution  of  either  set  of  nonlinear  algebraic 
equations  requires  about  the  same  number  of  calculations, 
strategies,  and  programming  skill. 
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In  a  manner  similar  to  the  earlier  one,  we  can  also  formu 
late  this  problem  as  a  set  of  nonlinear  differential  equations 


x  +  £x  =  0 

i  =  0 


(12) 


We  will  find  it  convenient  later  to  always  write  the  set  of 
differential  equations  as  a  set  of  first  order  equations.  Thus, 
with  the  definition  stated  in  the  first  of  the  following 
equations  we  have 
x  =  z 

z  =  -x£  (13) 

l  =  0 


If  we  attempt  a  slight  variation  of  the  preceeding  problem 
and  add  a  damping  term,  the  problem  will  become  seemingly  formid¬ 
able  in  an  analytical  identification  sense.  The  governing 
equation  is 

x  +  yx  +  £x  =  0  (14) 

As  you  might  guess,  if  the  damping  and  frequency  terms  are  un¬ 
known,  we  automatically  write 
x  =  z 


z  =  -y  z  -  £x 


(15) 


l  =  0 

y  =  0 


The  next  four  topics  of  discussion  will  be  background  to 
aid  in  solving  the  nonlinear  multipoint  boundary  value  problems 
we  have  been  formulating. 


Ill 

The  Newton  Raphson  Method 

We  are  now  going  to  discuss  a  generalization  of  Newton's 
method  of  tangents  and  later  we  will  generalize  this  a  bit  more. 

First,  let's  review  the  method  of  tangents  and  point  out 
the  properties  of  the  generalized  methods  that  are  desirable 
and  difficult  to  prove. 

Consider  the  transcendental  function  cos(x)  and  assume  that 
we  wish  to  find  its  first  zero,  and  that  from  some  obscure 
"a  priori"  knowledge  we  find  that  the  solution  is  near  one 
radian.  We  expand  our  equation 

f  (x)  =  cos (x)  =  0  (16) 

in  a  Taylor  series  about  the  approximate  root  xa 

(x-x  )  (x-x  ) 

f  (X)  =  f  (x  )  +  f'  (x  )  i'  i  ~  +  f”  (X  )  9,a  +...  (17) 

2 

Neglecting  the  terms  of  order  (x-x  )  and  higher  we  get  the 

a 

approximation 

f  (xj  +  f '  (x  )  (x-xj  :  0  (18) 

a  d  a 

Then 

x-xa  =  -  f  (xa)/f '  (xa) 

x  *  x  +  Axa  (19) 

a  a 

Ax  =  -  f  (x  )  / f  '  (x  ) 

d  da 

The  indicated  algorithm  is  generally  written  as 

*n+l  -  xn  -  f(jtn)/f'<xn>  (20> 


The  last  equation  states  what  is  called  the  Newton  Raphson 
algorithm  for  a  one  dimensional  problem. 

The  algorithm  for  the  zero  of  the  cosine  function  indicated 
does  reduce  to 

x.,  -  X  -(-cot(x  ))  *  x  +  cot(x  )  (21) 

n+i  n  n  n  n 

For  the  initial  condition  mentioned,  Table  I  shows  the  convergence 

on  the  true  answer  which  we  already  knew  to  be  n/2. 


TABLE  I 


The  table  was  constructed  by  interpolating  the  values  given  in 
the  CRC  Handbook. 

Notice  that  in  the  early  iterations  that  there  was  an 
oscillation  about  the  true  answer.  However,  as  the  final 
solution  was  approached,  the  convergence  was  monotone.  R.  Kalaba 
has  proven  that  often  the  method  gives  quadratic  and  monotone 
convergence  in  the  limiting  iterations.  An  approximate  inter¬ 
pretation  of  quadratic  convergence  is  that  at  each  iteration 
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the  number  of  significant  figures  is  doubled.  Of  course,  that 
implies  that  an  extremely  high  number  of  significant  figures 
must  be  used  in  the  calculations.  Because  of  the  "finiteness" 
of  the  tables  used,  the  quadratic  convergence  property  is  not 
illustrated  by  the  above  table.  It  will  also  be  shown  by  the 
following  refinement  that  the  monotone  behavior  is  accidental. 

The  results  of  solving  the  same  problem,  but  having  it 
done  on  a  digital  computer  using  16+  significant  figures  in  each 
calculation  are  shown  in  Table  II.  Notice  that  the  quadratic 
convergence  is  illustrated  by  this  data.  An  obvious  question  is 

TABLE  II 

Improved  Newton  Raphson  Data 


n 

Value  of  x  Eq. (21) 
n 

0 

ljOOOOOOOOOOOOOOO 

1 

1. 6)42092615934  331 

2 

1.57C|675277161251 

3 

1.570796326795)488 

4 

1.570796326794897 

5 

1.570796326794897 

what  is  the  accuracy  of  the  sine  and  cosine  subroutines?  The 
final  answer  is  accurate  to  sixteen  significant  decimal  digits. 

Also  notice  that  there  is  an  oscillation  about  the  answer  again. 

An  interesting  variation  is  to  apply  the  algorithm  to  a 
problem  involving  complex  variables.  In  approaching  the  solution 
of  the  biharmonic  equation  where  two  dimensional  Cartesian  co¬ 
ordinates  with  at  least  one  of  the  coordinates  having  finite  limits 
via  Laplace  or  Fourier  transforms,  it  is  often  necessary  to 


know  the  zeroes  of  the  transcendental. 

sin(Z)  +  Z  ■  0  (22) 

The  Newton  Raphson  algorithm  for  this  problem  becomes 

Z  . .  =  Z  -  (sin (Z  )  +  Z  )/U  +  cos(Zj)  (23) 

n+1  n  n  n  n 

Obviously  the  problem  now  is  arithmetical  except  for  the 
question  of  initial  guesses.  Let's  assume  that  a  little  bird 
flew  past  and  said,  "Awkl,  the  first  non  zero  root  is  near 
4  +  i2  and  each  successive  root  will  be  approximately  2n 
further  out  the  real  axis."  The  name  of  that  little  bird  is 
experience. 

Table  III  is  a  list  of  the  values  of  Z  that  will  be 
generated  in  obtaining  the  first  five  non  zero  roots  with  the 
aid  of  the  little  bird's  information.  The  improvement  of  an 
answer  is  stopped  whenever  the  modulus  of  the  last  correction 
is  less  than  10  5. 

It  is  obvious  that  the  convergence  is  rapid  for  the  above 
problem.  The  interpretation  of  "doubling  the  number  of  signi¬ 
ficant  figures"  with  each  iteration  now  raises  a  question.  Which 
significant  figures  are  doubled,  if  any?  A  rule  that  sometimes 
applies  is,  "the  largest  element's  significant  figures  are  doubled". 

The  example  was  included  for  tutorial  purposes  only.  The 
most  efficient  method  of  solving  the  above  problem  includes 
asymptotic  studies  of  Eq.  (22) . 

Consider  the  problem  of  solving  a  set  of  nonlinear  algebraic 
equations  in  "N"  real  unknowns.  We  will  write  this  as  the  vector 
equation  ^  ^  _v 

R  (X)  =  0  (24) 

where  the  right  hand  side  is  the  null  vector. 
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TABLE  III 

Complex  Newton  Raphson  Data  (Eg.  (22) 


Root  Index 

n 

Real  (Z J 
n 

Imag .  ( Z  ) 
n 

1 

0 

4. 

2. 

1 

4 . 2793583E  00 

2.2714714 

2 

4 . 2143169E  00 

2.2488951 

3 

4 . 2123884E  00 

2.2507277 

4 

4 . 2123922E  00 

2.2507286 

2 

0 

1.0492392E  01 

2.2507286 

1 

1.0938640E  01 

3.6368255 

2 

1 . 0801822E  01 

3.2135148 

3 

1 . 0721907E  01 

3.1062368 

4 

1 . 0712570E  01 

3.1031115 

5 

1 . 0712537E  01 

3.1031487 

3 

0 

1.6992537E  01 

3.1031487 

1 

j  1 . 7109956E  01 

3.6695510 

2 

1 . 7077178E  01 

3.5574931 

3 

1. 7073389E  01 

3.5511021 

4 

1. 7073365E  01 

3.5510873 

4 

0 

2 . 3353365E  01 

3.5510873 

1 

2 . 3411799E  01 

3.9114766 

2 

2 . 3398998E  01 

3.8601203 

3 

2.3398356E  01 

3.8588097 

4 

2.3398355E  01 

3.8588090 

5 

0 

2 . 9678355E  01 

3.8588090 

1 

2 . 9714814E  01 

4.1235346 

2 

2 . 9708304E  01 

4.0941315 

3 

2.9708120E  01 

4.0937050 

4 

2 . 9708120E  01 

4.0937049 
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The  Newton  Raphson  approach  is  to  expand  the  function 
about  some  approximate  root  X  . 

Cl 

Then 

R(X  )  +  Si  (X)  {X  -  X  }  +  ...=  $(2)  =  $  (25) 

a  3X  a  a 

Hopefully  assuming  the  questioned  equality  and  neglecting  the 
indicated  higher  order  terms  yields 
_  3R{X  )  _1  __ 

X  =  $  -  { - - —  }  R(X  )  (26) 

a  /n  It  a 


or  in  terms  of  the  generally  expected  subscripts  the  algorithm 
is 


X  -  {  S|  }  R 

n  ax  n  n 


(27) 


The  derivative  quantity  is  a  square  matrix  of  order  "N" 
and  we  hope  also  of  rank  "N" .  In  general  it  is  and  we  seldom 
need  consider  the  case  where  it  might  be  singular  in  the 
neighborhood  of  the  solution  X. 

Let's  assume  the  problem  is  to  solve  the  equations 

X1  +  x2  +  x3  =  3 

xl2  +  3x2  +  4x3  =  8  (28) 


Then 


rl  =  X1  +  x2  +  x3  ‘  3 
2 

r2  =  x^  +  3x2  +  4x3  - 


I 


8 


(29) 
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The  square  matrix  or  as  we  will  call  it,  the  "Jacobian 
matrix"  for  this  problem  is 


j  - 

n  3* 


n 


1 

2x. 


1 

3 


2x2  -1 


(30) 


n 


Assuming  the  null  vector  for  the  first  approximation  XQ, 

-1 


*1  -  *o  +  <-Jo>  K 

Substituting 


(31) 


/ 

\-i 

-  — 

h 

-i  -i  \ 

-3 

-  $  + 

0 

-3  -4 

-8 

1° 

0  +1  1 

0 

(32) 


The  successive  solution  vectors,  as  calculated  on  a  7094,  are 
shown  in  Table  IV. 

TABLE  IV 

Elements  of  Solution  Vector  at  Each  Iteration 


n 

X1 

X2 

X3 

0 

0. 

0. 

0. 

1 

.33333333 

2.6666667 

1.5E-8 

2 

.72191529 

1.4825046 

.79558007 

3 

.94095068 

1.0736089 

.98544037 

4 

.99710697 

1.0026408 

1.0002522 

5 

.99999356 

1.0000044 

1.0000020 

6 

.99999999 

1.0000000 

1.0000000 
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The  elements  of  the  residue  vector,  are  shown  in  Table  V. 

TABLE  V 

Elements  of  Residue  Vector  at  Each  Iteration 


n 

rl 

r2 

r3 

0 

-3. 

-8. 

0. 

1 

1.49E-8 

1.11E-1 

7.11 

2 

1.49E-8 

1.51E-1 

1.40 

3 

7.45E-9 

4.80E-2 

1.67E-1 

4 

0. 

3.15E-3 

5.04E-3 

5 

4.47E-8 

8.26E-6 

6.93E-6 

6 

0. 

0. 

0. 

The  solution  shown  above  might  cause  questions  as  to  whether 
the  convergence  is  quadratic  and  monotone.  The  quadratic  question 
is  needless  because  the  convergence  is  obviously  fast.  The 
monotone  question  is  meaningful  at  this  time.  Monotone  conver¬ 
gence  in  the  one  dimensional  case  meant  that  the  solution  was 
always  approached  from  the  same  side.  In  other  words  AX  was 

always  of  the  same  sign.  The  generalization  of  this  is  that 
-*T  •+  .  , 

AX^AX^-1  is  positive  when  monotone  and  negative  when  oscillating. 
In  the  limiting  iterations,  the  above  was  indeed  monotone. 

Referring  to  the  not  quite  so  simple  identification  problem, 
we  will  write  (see  Eq.  (9)) 


R(x) 


2 

x1  cos(x3*0)  +  —  sin(x3  *0)  -  1.00000 

x2 

x1  cos  (x3 *0.5)  +  sin(x3  *0.5)  -  1.35701 

X2 

x  cos(x,*l.)  +  —  sin(x,*l.)  -  1.38177 
1  3  x3  3 


(33) 


13 


The  above  equations  are  for  the  harmonic  oscillator  having 
mit  initial  conditions  and  a  unit  circular  frequency  (x^  =  xQ  *  1, 
*2  *  x  =  1,  x3  =  u>  =  =  1)  solution  for  t  =  0,  0.5,  and  1.0 

are  then  used. 

Table  VI  is  a  list  of  the  vector  X  starting  from  an  initial 
guess  in  which  each  element  is  50%  the  final  solution. 

TABLE  VI 

Newton  Raphson  Data 


n 

X1 

X2 

X3 

0 

0.5 

0.5 

0.5 

1 

1.00000 

0.98410 

1.86895 

2 

1.00000 

0.74143 

0.92949 

3 

1.00000 

1.00173 

1.00862 

4 

1.00000 

0.99999 

1.00002 

5 

1.00000 

1.00000 

1.00000 

With  nonlinear  equations  such  as  these,  there  is  always  the 
possibility  of  finding  a  solution  that  isn't  desired.  Using  an 
initial  guess  vector  of  (0,  -2,2),  the  same  program  that  converged 
on  the  nice  answers  above  converged  upon  the  answer  (1,  -36.7,-36.7). 

Let's  now  consider  the  solution  of  a  function  to  be  optimized 
subject  to  a  set  of  equality  constraints.  For  the  simple  first 
example  we  want  to  consider  only  functions  whose  constraints  are 
linear  and  the  function  to  be  optimized  is  of  a  quadratic  norm 
type.  Assuming  a  two  dimensional  problem,  we  write  typically 
x  +  ax2  =  6 

2 

(yx2  +  6x2“0)  =  minimum 


(34) 
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to  x^ 


By  differentiating  the  function  to  be  minimized  with  respect 
,  we  get 

2y(yx1  +  5x2-6)  =  0  (35) 


Thus 


x^  +  ax2  =  8 

or 

YXj^  +  6x2  =  +  6 


(36) 


v  _  8*5  -  a8 

X1  "  Z  -~~ay 


S---B 1 

5  -  ay 


which  is  nice  as  long  as  6  f  ay  which  is  the  usual  case.  Before 
this  simple  problem  lulls  us  into  trouble,  let's  point  out  that 
the  equation  resulting  from  differentiating  with  respect  to  x2 
would  have  been 

26(yxx  +  6x2  -  9)  =  0  (37) 

which  is  equivalent  to  the  one  we  had. 

Considering  a  problem  in  three  or  more  dimensions  we  need 
to  be  careful.  Assume  a  constraint 


l  aiixi  -  =i  (38) 


If  we  assume  a  cost  function  to  be  minimized,  we  can  take  it  to 
be  of  the  form 

M  N  , 

E  (E  a.  .x.  -  c.r  =  cost  (39) 

j=2  i  31  1  3 

where  M  >  N  -  1. 

We  supposedly  can  differentiate  the  cost  function  with  respect  to 


each  element  of  it  and  obtain  N  -  1  or  N  linearly  independent 
equations ,  depending  on  M.  Because  there  can  be  only  N  linearly 
independent  equations ,  one  might  be  led  to  believe  that  any  N  -  1 
of  the  equations  will  give  the  same  solution  when  coupled  with  the 
constraint  equation  given  above.  That  isn't  so  because  the  right- 
hand-sides  can  be  picked  at  random.  Regardless,  the  best  solution 
will  probably  be  obtained  by  an  iterative  method. 

For  our  purposes  we  can  assume  that  the  above  least  squares 
type  approximation  is  adequate.  The  fact  that  each  of  the  equa¬ 
tions  from  minimizing  the  error  or  cost  term  involves  all  the 
data  points  is  responsible  for  this.  It  should  be  expected  that 
any  iterative  scheme  would  need  this  type  approximation  for  a 
starting  approximation  anyway. 


IV 

Function  Space 

This  section  will  be  very  incomplete.  It  is  included  for 
emphasis  of  a  point  that  would  require  great  labor  to  explain 
with  any  significant  rigor.  Function  space  is,  uh,  a  very 
nebulous  abstraction  when  an  attempt  is  made  to  visualize  it  in 
the  Cartesian  frame  of  mind  we  are  accustomed  to.  Some  dis¬ 
cussion  of  function  space  is  needed  because  we  are  going  to 
differentiate  expressions  with  respect  to  functions  of  variables 
and  not  with  respect  to  variables.  Of  course,  we  can  always 
rationalize  that  a  variable  is  simply  a  special  case  of  a  function. 
The  point  is  though,  we  are  to  be  differentiating  with  respect  to 
continuous  dependent  functions  of  an  independent  variable.  This 
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function  space  might  be  thought  to  have  independent  variables 
that  are  the  functions  of  the  true  independent  variable.  Another 
independent  variable  might  be  an  iteration  index  where  we  are 
continuously  updating  approximations  to  these  functions.  Think 
about  it???  A  discussion  of  function  space  is  in  Lanczos'  text 
entitled  Linear  Differential  Operators. 

Some  appreciation  of  function  space  is  necessary  for  complete 
confidence  in  performing  Newton-Raphson-Kantorovich  expansions  in 
function  space.  We'll  call  them  NRK  expansions. 

We  have  written  some  vector  differential  equations  earlier. 
Now,  we're  going  to  differentiate  these  vector  equations  with 
respect  to  some  of  the  functions  within.  The  equations  are  like 

y  =  ?  (y,t)  (40) 

The  "independent  variables"  in  this  function  space,  that  we're 
about  to  expand  this  equation  in,  are  the  elements  of  the  vectors 

y  and  y.  Now,  both  sides  of  the  equation  should  be  expanded 

with  respect  to  the  totality  of  y  and  y.  Howeverl  ,  the  LHS  is  a 

function  of  y  only  and  the  RHS  is  a  function  of  y  only.  This 
is  one  reason  why  we  always  write  the  equations  in  this  first 
order  form.  Let's  diverge  for  a  moment. 

Recall  that: 

The  derivative  of  a  vector  with  respect  to  another  vector 
is  a  matrix  or  maybe  it's  a  second  order  tensor.  If  both  the 
differentiated  and  differentiating  vector  are  the  same,  the 
matrix  is  the  unit  or  identity  matrix.  If  they  are  not  the  same, 
the  resulting  second  order  tensor  (?)  is  called  the  Jacobian 
Matrix.  The  second  derivative  of  a  vector  with  respect  to  another 
vector  gives  a  result  that  has  three  subscripts. 
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s&* 


Anyway 

y  =  I  (y,t)  (40) 

expanding  about  the  nth  iteration  where  yn  and  yn  are  known 


n 


+  4 

3y 


(yn  +  l  -  V  +  *  ' 


n 


=  ?  (yn,t)  + 

•y  j 


n 


(y  , ,  -  y  )  +  .  . 
Jn+1  1  r\ 


(41) 


We  have  neglected  r. !  1  the  higher  order  terns,  the  first  of  which 
on  the  right-hand  side  cannot  be  written  like 


n 


2 


(42) 


but  must  be  written  as 


n 


(43) 


where  the  order  in  which  the  matrix  multiplication  or  tensor 

contraction  must  be  performed  is  shown  by  the  parentheses. 

Recognizing  the  unit  matrix  or  identity  operator  resulting 
— ► 

from  differentiating  y  with  respect  to  itself  and  making  a  slight 

introduction  of  J  for  the  Jacobian  matrix  on  the  RHS ,  we  easily 

n  1 

write 


y 


n+1 


I 

n 


J 

n 


(44) 
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which  with  an  introduction  of  obvious  nomenclature  can  be  written 
(some  would  say,  "simplifies  to") 


^n+1  -  Jn  ^n+1  +  F 


n 


(45) 


Henceforth,  if  it  is  implied  that  NRK '  s  are  relevant,  we  auto¬ 
matically  transform  a  nasty  nonlinear  equation 

y  =  1  (y, t)  (40) 


into  the  simple 


^n+1  Jn  ^n+1  +  F 


n 


(45) 


which  is  linear  with  known  variable  coefficients  and  a  known 


forcing  function  F^.  An  obvious  problem  dealing  with  the  first 
assumption  for  yQ  will  be  dealt  with  later. 


V 

Numerical  Integration  of  Initial  Value  Problems 


The  wheel  has  suffered  through  more  re-inventions  in  the 
field  of  numerical  integration  than  any  sane  person  would  like 
to  count.  The  necessary  information  presented  here  is  covered 
in  great  detail,  often  with  so-called  "rigor",  in  no  less  than 
one-hundred-thirty-seven  and  one-half  good  books.  The  reason 
it  appears  here  is  for  the  sake  of  sameness  in  the  names 
applied  to  the  methods  used.  Occasionally  there  is  somebody 
from  a  "culturally  and  socially  deprived  area"  who  may  not  know 
about  these  techniques. 

I  assume  that  we  all  know  about  the  forward,  backward,  and 
central  difference  approximations  to  first  derivatives.  We  have 
been  writing  our  differential  equations  in  a  first  order  form 
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and  will  continue  to  do  so.  Because  of  this,  we  only  need  dif¬ 
ference  approximations  to  first  derivatives.  It  is  common  know¬ 
ledge  that  the  error  term  for  forward  and  backward  difference 

2 

approximations  is  of  order  h  while  the  order  is  h  for  central 
difference  expressions.  The  following  characteristics  of  these 
basic  approximations  should  be  recognized: 

(1)  Forward  difference  methods  0(h)  are  self-starting 
and  yield  explicit  formulae. 

(2)  Central  difference  methods  have  to  be  started  by 
other  means  but  do  yield  explicit  formulae. 

(3)  Backward  difference  methods  do  not  yield  explicit 
formulae  for  nonlinear  differential  equations. 

The  simple  forward  difference  methods  have  the  most 
advantages  but  they  also  have  error  problems,  relatively.  The 
recurrence  formulae  for  a  problem  like 

y  =  ?  (y /t)  (40) 

is 

yj+1  ?j  +  hi  (y-j/tj)  (46) 

From  the  known  initial  conditions  of  this  initial  value  problem, 
yQ  is  known  and  then  all  other  y's  can  be  calculated.  A  simple 
iterative  form  that  will  retain  all  desirable  characteristics  of 
the  forv/ard  difference  method  and  gain  in  accuracy  at  the  expense 
of  speed,  only,  is  called  a  predictor  corrector  method. 

The  predictor  corrector  method  is  easily  illustrated  by  the 
following  flow  diagram 


M3" 
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The  PC  method  above  nas  the  advantages  of  the  forward 

difference  method  outlined  previously,  an  error  of  ( j) *  and 

is  self-starting.  It  is  more  expensive  than  the  central  dif¬ 
ference  method  of  the  same  accuracy  in  computing  time.  However, 
its  self-starting  characteristic  is  usually  reward  enough  to 

offset  that  expense. 

We  will  have  need  to  integrate  equations  like  the  one 
shown  above  and  equations  like 

y  =  J(t)  y  +  F  (t )  (47) 

This  "new  type"  of  equation  is  simply  a  linear  differential 
equation  with  variable  coefficients,  namely  the  matrix  J(t)  a 
and  a  forcing  function  F(t).  When  this  is  a  deterministic 
form,  there  are  essentially  no  unanswered  questions  if  we  :  re 
competent  enough  such  that  the  previous  brief  discussion  on 
numerical  integration  was  very  boring.  However!,  we  are  not 
going  to  be  discussing  deterministic  forms.  The  matrix  J(t) 
and  the  vector  F(t)  will  be  functions  of  discrete  data  (actually, 
they  will  be  formed  from  the  discrete  solutions  of  y  =  £(y,t)). 

The  best  means  of  integrating  these  non-deterministic  pro¬ 
blems  is  an  open  question.  The  PC  method  outlined  above  is 
adequate.  The  values  of  J (t)  and  F(t)  used  will  be  at  the 
half-step  of  the  integration  step  under  consideration. 

Many  refinements  can  be  made  to  this  lacksadasial  procedure. 
However,  these  refinements  may  be  inconsistent  with  the  accuracy 
of  the  problems  in  mind.  For  these  reasons  and  others,  Runge- 
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Kutta  and  Adams-Moulton  techniques  are  often  not  used  for  the 
problems  in  mind.  For  problems  requiring  an  extreme  accuracy, 
there  are  generally  better  methods  than  the  RK  and  AM  schemes. 


VI 

Linear  Multipoint  Boundary  Value  Problems 
as  Initial  Value  Problems 


We  are  going  to  discuss  an  old,  old  method  for  solving  these 
problems.  It  is  frequently  used  in  analytic  studies,  but,  for 
some  reason  it  is  relatively  rare  in  numerical  integration  studies. 
Consider  a  linear  differential  equation  of  Nth  order 

y  =  J  y  +  ?  (48) 


that  is  subject  to  at  least  N  boundary  conditions.  It  is  elemen¬ 
tary  that  the  solution  may  be  written  in  terms  of  a  particular 
solution  plus  a  linear  combination  of  N  linearly  independent 
homogeneous  solutions.  The  particular  solution  is  the  solution  of 


y<0)  =  J  y<°>  +  F 


(49) 


(o) 


subject  to  a  finite  initial  condition  vector  y  (o) .  The  homo¬ 
geneous  solutions  are  solutions  of 


-00 

y  =  J 


(k) 


1  <  k  <  N  (50) 


subject  to  linearly  independent  finite  initial  conditions.  For 
the  initial  conditions  to  be  linearly  independent,  the  following 
must  hold; 


det 


t  0 


•  •  • 


(51) 
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This  is  a  generalization  of  what  is  commonly  referred  to  as  the 
Wronskian  monstrosity. 

Thus,  we  can  unhesitatingly  write 


+ 


N 

Z 

k=l 


ak  y 


(k) 


(52) 


where  the  combining  coefficients,  (a^) ,  must  be  determined  in 
such  a  manner  so  as  to  satisfy  the  boundary  conditions.  If  there 
are  N  boundary  conditions,  the  a's  will  be  unique.  If  there  are 
more  than  N  boundary  conditions,  the  a's  will  also  be  dependent 
upon  the  minimization  of  the  norm  chosen. 

This  method  has  the  obvious  advantage  of  accuracy  and  ef¬ 
ficiency  in  storage.  It  often  has  efficiency  in  execution  time 
too.  With  a  little  ingenuity,  about  all  that  has  to  be  stored 
in  an  (N+l)  by  (N+2)  matrix  for  the  solution  of  the  exactly 
determined  problem.  For  a  problem  with  P  boundary  conditions, 
a  (P+1)  by  (N+2)  array  must  be  stored  for  use  in  calculating 
the  a's. 

The  initial  conditions  of  the  particular  solution  should 
reflect  the  maximum  knowledge  of  the  true  initial  conditions. 

If  the  a's  are  all  zero,  then  the  initial  conditions  used  are 
the  true  initial  conditions  for  the  problem.  However,  the 
initial  conditions  of  the  homogeneous  solutions  have  many 
possibilities.  An  obvious  valid  choice  for  the  vectors  y 
1  <  k  <  N  is  the  columns  of  an  identity  matrix.  This  is  a  very 
simple  nonsingular  matrix  and  the  columns  (or  rows)  of  any  non¬ 
singular  matrix  satisfy  the  generalized  monstrosity  given  above. 


i 


► 
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However ,  practicality  and  generality  occasionally  wreck  the  simple 
plans  of  men.  I've  encountered  some  problems  in  which  it  is 
necessary  to  use  the  following  scheme.  Use  the  same  vector  that 
is  considered  apropos  for  the  particular  solutions  except  multiply 
the  kth  element  of  the  vector  by  some  scalar.  This  scalar  has 
few  restrictions  but  something  positive,  nonunity,  and  less  than 
two  seems  preferable.  Generally  1.1  to  1.5  seems  very  good.  A 
further  exception  is  that  if  the  kth  element  is  null,  use  some¬ 
thing  other  than  zero  for  that  element,  say  one. 

Although  the  one  shot  procedure  described  theoretically 
works,  it  can  be  rendered  useless  by  roundoff  error.  It  is 
therefore  a  good  procedure  to  make  it  a  repetitive  procedure 
such  that  we  are  seemingly  trying  to  find  the  initial  conditions 
of  the  problem.  When  we  have  obtained  the  true  initial  conditions, 
all  the  a's  go  to  zero  or  some  negligible  values. 

A  further  improvement  is  to  superimpose  particular  solutions 
of  Eq.  (48).  To  do  this,  we  write 


-► 

y 


n+l 

l 

k=l 


y 


(k) 


(53) 


where  the  elements  of  y  are  chosen  to  satisfy  the  boundary  condi¬ 
tions  and  the  auxiliary  condition 
n+l 

l  gv  =  1  (54) 

k=l  * 

This  auxiliary  condition  obviously  states  that  in  the  superposition, 
the  forcing  functions  must  sum  to  be  only  F(t).  This  strategy 
does  add  greatly  to  the  control  of  roundoff  error  in  the  super- 
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position  of  solutions.  The  initial  conditions  used  are  those 
previously  discussed,  except,  the  solution  y^  is  now  y^n+^. 

If  the  true  initial  conditions  are  used  for  y^n+* 

9n+l  *  1 

(55) 

9j=0  l<j<n 

VII 

Quasi linearization 

We  will  now  attempt  to  tie  all  these  things  together  to 
attempt  to  solve  the  identification  problems  mentioned  pre¬ 
viously.  We  wish  to  solve  problems  that  have  nonlinear  solu¬ 
tions.  The  governing  differential  equations  are  often  linear, 
but,  the  problem  is  nonlinear  because  as  shown  in  the  simplest 
identification  problem,  the  desired  answers  are  not  linear  combi¬ 
nations  of  the  given  data.  We  have  discussed  Newton  Raphson 
techniques  whereby  we  could  solve  nonlinear  problems  by  successive 
linear  steps.  We  have  also  discussed  one  of  many  methods  for 
solving  linear  multipoint  boundary  value  problems. 

My  guess  as  to  what  is  quasilinearization  is  that  it  is 
the  totality  of  the  following  computational  schemes  and  portions 
of  computational  schemes: 

1)  Formulate  the  differential  equations  in  the  first 
order  coupled  form  as  shown  earlier. 

2)  If  there  are  unknown  parameters,  supplement  the 
above  differential  equations  with  the  null  dif¬ 
ferential  equations  described  earlier. 
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3)  If  there  are  more  data  points  than  unknowns,  select 
a  norm  which  will  be  used  as  a  "goodness  of  fit" 
measure . 

4)  Do  a  Newton-Raphson-Kantorovich  expansion  of  the 
nonlinear  equations  to  obtain  a  related  set  of 
linear  differential  equations. 

5)  Solve  the  set  of  linear  variable  coefficient  dif¬ 
ferential  equations  subject  to  the  boundary  condi¬ 
tions  that  may  be  exact  and/or  best  fit.  This 
solution  will  generally  be  obtained  as  a  sequence 
of  initial  value  problems  to  have  the  highest 
reasonable  degree  of  accuracy.  The  procedure 

in  this  step  is  repeated  until  the  initial  condi¬ 
tions  are  known  for  the  nonlinear  differential 
equations  such  that  the  solutions  generated  for 
these  initial  conditions  satisfy  the  ^iven  boundary 
conditions.  If  the  problem  is  not  exactly  determined, 
the  question  arises  as  to  what  does  satisfy  mean.? 

This  question  has  to  be  answered  for  each  problem. 

VIII 

Applications 

The  applications  of  such  methods  are  ubiquitous.  The 
essence  of  engineering  design  might  be  said  to  be,  "select  a 
system  governed  by  the  laws  of  physics  and  that  benefits  mankind 
with  economics  as  a  constraint".  Generally  in  engineering  design 
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I 


the  benefit  to  mankind  is  established  before  we  get  the  problem. 
Thus,  we  have  only  to  formulate  the  laws  of  physics,  which  for 
simple  mechanical  problems  is  simply  Newton's  law,  and  pick 
the  parameters  that  make  it  economical.  Let's  mention  a  few 
examples : 

1)  For  some  unknown  reason,  it  is  beneficial  to  mankind 
to  fly  out  in  space.  A  space  vehicle  is  attracted  by 
the  earth's  gravitational  field.  There  is  a  potential 
function  whose  gradient  is  the  acceleration  due  to 
gravity.  The  acceleration  due  to  gravity  is  not  a 
constant  for  a  given  radial  distance  from  the  earth's 
center.  Identification  of  the  Fourier  coefficients 

of  an  eigenfunction  expansion  of  this  potential  should 
be  possible  via  the  means  we  have  described. 

2)  A  control  system  that  performs  a  very  simple  function 

like  raising  a  bulldozer  blade  is  very  nonlinear  and 

might  be  represented  by 

.  •  *2  *2 
x  +  a-^  x  +  a2*  +  a^x  x  +  a^xx  =  a^  u(t) 

The  a's  are  functions  of  blade  weight,  pump  and  pipe 
sizes,  oil  properties,  pressures,  etc.  It  would  be 
desirable  to  pick  the  a's  such  that  a  desired  system 
response  is  obtained  and  economy  is  also  achieved. 

This  could  be  done  if  the  proper  variation  of  the  a's 
can  be  controlled.  However,  these  a's  are  not  yet 
known.  They  can  be  determined  by  measuring  the  responses 


of  real  systems.  From  these  responses,  empirical 
values  of  a's  can  be  obtained. 

3)  Similar  problems  abound  in  almost  all  process  plants, 
control  systems,  and  any  time  variant  system. 


IX 

Data  Structures 


The  concepts  of  data  structures  should  be  applied  to  most 
large  programs  on  digital  computers.  These  concepts  involve 
knowing  the  manners  in  which  data  are  stored  internally  in  digital 
computers . 

To  visualize  a  very  simple  example,  consider  a  Newton 
Raphson  algorithm  like 


-*■  -*•  -1  -* 

X  ,,  =  x  -  J  X 
n+l  n  n  n 

In  the  calculation  procedure,  the  usual  thing  to  do  would  be  to 

have  two  vectors  reserved  for  the  x  and  x  , .  .  V«hen  x  . .  is 

n  n+l  n+l 

calculated  and  convergence  has  not  been  achieved,  the  obvious 

#  ■+  t 
thing  to  do  would  be  to  store  the  in  the  space  used  for 

xnQ  on  the  last  iteration.  A  frightfully  more  efficient  scheme 

would  be  to  interchange  the  names  of  xn+^  and  xn. 

Ideas  of  this  type  are  easily  implemented  if  one  is  familiar 


with  data  storage  characteristics.  Such  methods  could  reduce 
execution  times  of  many  quasilinearization  programs  by  up  to  30%. 


Extensions 


The  extensions  that  can  be  made  to  the  groundwork  we  have 
laid  here  are  numerous . 

Typically: 

1)  Formulating  general  rules  governing  cost  functions. 

2)  Inclusion  of  norms  other  than  least  square  norms. 

All  other  norms  will  require  iterative  solutions 
within  each  iteration.  Is  this  expense  worth  it? 

3)  Development  of  high  accuracy  integration  schemes 
for  the  variable  coefficient  differential  equations. 
Problems  like  the  geopotential  problem  demand  it. 

4)  Use  of  symbol  manipulation  compilers  to  reduce  the 
effort  in  formulating  the  first  order  and  variable 
coefficient  differential  equations. 
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This  paper  is  a  tutorial  discussion  of  identification  problems. 

The  basic  principles  of  one  method  of  identification  are  discussed.  The 
method  of  quasilinearization  has  been  developed  by  Richard  Bellman  and 
Robert  Kalaba  and  others.  This  method  is  discussed  in  some  detail  with 
emphasis  on  programming  strategies.  These  strategies  are  intended  to 
improve  the  utility  of  a  general  quasi lineari zation  program.  Specific 
strategies  arc  concerned  with  expansion  of  convergence  space,  super¬ 
position  of  particular  solutions,  and  convergence  criteria. 
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